Brazilian patent BR112012002839B1: Bandwidth extension method, bandwidth extension device, integrated circuit and audio decoding device

Abstract:
BANDWIDTH EXTENSION METHOD, BANDWIDTH EXTENSION APPARATUS, PROGRAM, INTEGRATED CIRCUIT AND AUDIO DECODING APPARATUS. Provided is a bandwidth extension method that reduces the amount of computation in bandwidth extension and suppresses quality deterioration in the extended bandwidth. In the bandwidth extension method: a low frequency bandwidth signal is transformed into a QMF domain to generate a first low frequency QMF spectrum (S11); pitch-shifted signals are generated by applying different shift factors to the low frequency bandwidth signal (S12); a high frequency QMF spectrum is generated by time-stretching the pitch-shifted signals in the QMF domain (S13); the high frequency QMF spectrum is modified (S14); and the modified high frequency QMF spectrum is combined with the first low frequency QMF spectrum (S15).
Publication number: BR112012002839B1
Application number: R112012002839-1
Filing date: 2011-06-06
Publication date: 2020-10-13
Inventors: Tomokazu Ishikawa; Takeshi Norimatsu; Huan Zhou; Kok Seng CHONG; Haishan Zhong
Applicant: Panasonic Intellectual Property Corporation Of America
Main IPC class:

Description:

[Technical Field]
[001] The present invention relates to a bandwidth extension method for extending the frequency bandwidth of an audio signal.
[Background Art]
[002] Audio bandwidth extension (BWE) technology is typically used in modern audio codecs to encode a wideband audio signal efficiently at a low bit rate. Its principle is to use a parametric representation of the original high frequency (HF) content to synthesize an approximation of the HF from the lower frequency (LF) data.
[003] Figure 1 is a diagram showing such a BWE-based audio codec. In its encoder, a wideband audio signal is first separated (101 and 103) into an LF part and an HF part; the LF part is encoded (104) in a waveform-preserving manner; meanwhile, the relationship between the LF part and the HF part is analyzed (102) (typically in the frequency domain) and described by a set of HF parameters. Because the HF part is described parametrically, the multiplexed (105) waveform data and HF parameters can be transmitted to the decoder at a low bit rate.
[004] In the decoder, the LF part is first decoded (107). To approximate the original HF part, the decoded LF part is transformed (108) into the frequency domain, and the resulting LF spectrum is modified (109) to generate an HF spectrum under the guidance of some of the decoded HF parameters. The HF spectrum is further refined (110) by post-processing, also under the guidance of some of the decoded HF parameters. The refined HF spectrum is converted (111) into the time domain and combined with the delayed (112) LF part. As a result, the final reconstructed wideband audio signal is output.
[005] Note that in BWE technology, an important step is generating the HF spectrum from the LF spectrum (109). There are several ways to do this, such as copying the LF portion to the HF location, non-linear processing, or upsampling.
[006] A well-known audio codec that uses such BWE technology is MPEG-4 HE-AAC, in which the BWE technology is specified as spectral band replication (SBR): the HF part is generated simply by copying the LF portion, in the QMF representation, to the HF spectral location.
[007] Such a spectral copy operation, also called patching, is simple and has proven efficient in most cases. However, at very low bit rates (for example, <20 kbit/s mono), where only a small LF bandwidth is feasible, SBR can lead to unwanted auditory artifacts such as roughness and unpleasant timbre (for example, see Non-Patent Literature (NPL) 1).
[008] Therefore, to avoid the artifacts that result from the mirroring or copying operation in low bit rate coding, the standard SBR technology has been improved and extended with the following major changes (for example, see NPL 2): (1) the patching algorithm is changed from a copy pattern to a patching pattern driven by a phase vocoder; (2) the adaptive time resolution for the post-processing parameters is increased.
[009] As a result of the first modification ((1) above), spreading the LF spectrum by multiple integer factors intrinsically ensures harmonic continuity in the HF. In particular, no unwanted roughness due to beating effects can emerge at the boundary between the low frequency and the high frequency, or between different high frequency parts (for example, see NPL 1).
[0010] The second modification ((2) above) helps the refined HF spectrum adapt better to signal fluctuations in the replicated frequency bands.
[0011] Because the new patching preserves the harmonic relationship, it is called harmonic bandwidth extension (HBE). The advantages of HBE over standard SBR have also been confirmed experimentally for low bit rate audio coding (for example, see NPL 1).
[0012] Note that the two modifications above only affect the HF spectrum generator (109), the remaining processes in HBE being identical to those in SBR.
[0013] Figure 2 is a diagram showing the HF spectrum generator in the prior-art HBE. Note that the HF spectrum generator includes the T-F transform 108 and the HF reconstruction 109. Given the LF part of a signal, suppose its HF spectrum is composed of (T-1) HF harmonic patches (each patching process produces one HF patch), from the 2nd order (the HF patch with the lowest frequency) to the T-th order (the HF patch with the highest frequency). In the prior-art HBE, all of these HF patches are generated independently and in parallel by phase vocoders.
[0014] As shown in figure 2, (T-1) phase vocoders (201 to 203) with different stretch factors (from 2 to T) are used to stretch the input LF part. The stretched outputs, which have different lengths, are bandpass filtered (204 to 206) and resampled (207 to 209) to generate HF patches by converting the time dilation into a frequency extension. By setting the stretch factor to twice the resampling factor, the HF patches maintain the harmonic structure of the signal and are twice as long as the LF part. Then, all HF patches are delay-aligned (210 to 212) to compensate for the potentially different delay contributions of the resampling operation. In the last step, all delay-aligned HF patches are summed and transformed (213) into the QMF domain to produce the HF spectrum.
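The length relationship in the paragraph above can be checked with a small worked example. This is only a sketch under stated assumptions: each patch of order k is assumed to be stretched by k and then decimated by k/2 (the "stretch factor is twice the resampling factor" rule), and the function name `patch_dimensions` is hypothetical.

```python
from fractions import Fraction


def patch_dimensions(lf_len: int, order: int):
    """Return (patch length, normalized-frequency scale) for one HF patch.

    Assumption for illustration: the patch of order `order` is time-stretched
    by `order` and then decimated by `order / 2`.
    """
    stretch = Fraction(order)              # time-stretch factor of the phase vocoder
    resample = stretch / 2                 # resampling (decimation) factor
    length = lf_len * stretch / resample   # stretch first, then decimate
    freq_scale = resample                  # decimation scales normalized frequency
    return length, freq_scale


for k in (2, 3, 4):
    print(k, patch_dimensions(1024, k))
```

Under these assumptions every patch comes out twice as long as the LF part regardless of its order, so patches of all orders can be added sample by sample after delay alignment.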
[0015] This HF spectrum generator requires a high amount of computation. The computation comes mainly from the time stretching operation, which the phase vocoders perform with a series of short-time Fourier transforms (STFT) and inverse short-time Fourier transforms (ISTFT), and from the subsequent QMF operation applied to the time-stretched HF part.
[0016] A general introduction to the phase vocoder and the QMF transform is given below.
[0017] A phase vocoder is a well-known technique that uses frequency domain transforms to implement a time-stretching effect. That is, it modifies the temporal evolution of a signal while keeping its local spectral characteristics unchanged. Its basic principle is described below.
[0018] Figure 3A and figure 3B are diagrams showing the basic principle of time stretching performed by the phase vocoder.
[0019] The audio is divided into overlapping blocks, and these blocks are re-spaced so that the hop size (the time interval between successive blocks) differs between the input and the output, as shown in figure 3A. Here, the input hop size Ra is smaller than the output hop size Rs; as a result, the original signal is stretched with the rate r shown in (Equation 1) below.
$$ r = \frac{R_s}{R_a} $$ (Equation 1)
[0020] As shown in figure 3B, the re-spaced blocks are overlapped in a coherent pattern, which requires a frequency domain transform. Typically, the input blocks are transformed into the frequency domain and, after an appropriate phase modification, the new blocks are transformed back into output blocks.
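The re-spacing idea of figure 3A can be sketched as follows. This is a deliberately simplified illustration: the names `stretch_rate` and `respace_blocks` are hypothetical, the rate is assumed to be the ratio Rs/Ra, and the phase modification that makes the overlapped blocks coherent (figure 3B) is omitted, so this is not a phase vocoder.

```python
def stretch_rate(ra: int, rs: int) -> float:
    """Stretch rate, assumed here to be the output hop over the input hop."""
    return rs / ra


def respace_blocks(x, block_len: int, ra: int, rs: int):
    """Cut x into overlapping blocks at input hop `ra` and overlap-add
    them at output hop `rs`.

    No phase correction is applied, so the blocks are not superimposed
    coherently; only the re-spacing itself is illustrated.
    """
    if len(x) < block_len:
        raise ValueError("signal shorter than one block")
    n_blocks = 1 + (len(x) - block_len) // ra
    out = [0.0] * ((n_blocks - 1) * rs + block_len)
    for b in range(n_blocks):
        for i in range(block_len):
            out[b * rs + i] += x[b * ra + i]
    return out


x = [1.0] * 16
y = respace_blocks(x, block_len=4, ra=2, rs=4)
print(len(x), len(y), stretch_rate(2, 4))
```

For long signals the ratio of output length to input length approaches the stretch rate; for a short signal such as the one above, the block length at the edges makes it slightly smaller.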
[0021] Following the above principle, most classic phase vocoders adopt the short-time Fourier transform (STFT) as the frequency domain transform, and involve an explicit sequence of analysis, modification and resynthesis for time stretching.
[0022] QMF banks transform time domain representations into joint time-frequency representations (and vice versa), and are typically used in parametric coding schemes such as spectral band replication (SBR), parametric stereo coding (PS) and spatial audio coding (SAC). A characteristic of these filter banks is that the complex-valued frequency domain (subband) signals are effectively oversampled by a factor of two. This allows post-processing of the subband domain signals without introducing aliasing distortion.
[0023] In greater detail, given a real-valued discrete-time signal x(n), the analysis QMF bank produces the complex-valued subband domain signals Sk(n) through (Equation 2) below.
(Equation 2)
[0024] In (Equation 2), p(n) represents the impulse response of a low-pass prototype filter of order L-1, α represents a phase parameter, M represents the number of bands, and k is the subband index with k = 0, 1, ..., M-1.
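An analysis bank of this kind can be sketched in a few lines. The sketch below assumes the commonly used complex-exponential modulated form S_k(n) = sum over l = 0..L-1 of x(M·n - l) · p(l) · exp(j·(π/M)·(k + 0.5)·(l + α)); this form is an assumption chosen to match the symbols defined above, not a quotation of (Equation 2). The function name `qmf_analysis` and the crude rectangular prototype in the usage lines are likewise hypothetical.

```python
import cmath
import math


def qmf_analysis(x, p, M: int, alpha: float = 0.0):
    """Complex-exponential modulated analysis bank (assumed form).

    x: real input samples, p: prototype low-pass impulse response (length L),
    M: number of subbands and hop size. Returns S[n][k] for time slot n and
    subband k. Samples before the start of x are treated as zero.
    """
    L = len(p)
    slots = []
    for n in range(len(x) // M):
        slot = []
        for k in range(M):
            acc = 0j
            for l in range(L):
                idx = M * n - l
                if 0 <= idx < len(x):
                    mod = cmath.exp(1j * math.pi / M * (k + 0.5) * (l + alpha))
                    acc += x[idx] * p[l] * mod
            slot.append(acc)
        slots.append(slot)
    return slots


# Placeholder prototype: a rectangular low-pass, far cruder than a real design.
S = qmf_analysis([1.0] * 32, p=[1 / 8] * 8, M=4)
print(len(S), len(S[0]))  # time slots, subbands
```

With hop size M, every M input samples yield one time slot of M complex subband values, so the number of complex coefficients equals the number of real input samples.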
[0025] Note that, like the STFT, the QMF transform is also a joint time-frequency transform. That is, it provides the frequency content of a signal and the change of that frequency content over time, where the frequency content is represented by frequency subbands and the time axis is represented by time slots.
[0026] Figure 4 is a diagram showing a QMF analysis and a synthesis scheme.
[0027] In detail, as shown in figure 4, a given real audio input is divided into successive overlapping blocks of length L and hop size M (figure 4 (a)), and the QMF analysis process transforms each block into one time slot consisting of M complex subband signals. In this way, L time domain input samples are transformed into L complex QMF coefficients, composed of L/M time slots and M subbands (figure 4 (b)). Each time slot, combined with the previous (L/M - 1) time slots, is synthesized by the QMF synthesis process to reconstruct M real time domain samples (figure 4 (c)) with near-perfect reconstruction.
[Citation List]
[Non-Patent Literature]
[0028] [NPL 1] Frederik Nagel and Sascha Disch, 'A harmonic bandwidth extension method for audio codecs', IEEE Int. Conf. on Acoustics, Speech and Signal Proc., 2009.
[0029] [NPL 2] Max Neuendorf, et al., 'A novel scheme for low bitrate unified speech and audio coding - MPEG RM0', 126th AES Convention, Munich, Germany, May 2009.
[Summary of the Invention]
[Technical Problem]
[0030] A problem with the prior-art HBE technology is its high amount of computation. The traditional phase vocoder that HBE adopts for signal stretching has a high amount of computation because it applies successive fast Fourier transforms (FFTs) and inverse fast Fourier transforms (IFFTs), and the subsequent QMF transform, applied to the time-stretched signal, increases the amount of computation further. Moreover, in general, an attempt to reduce the amount of computation brings a potential problem of quality degradation.
[0031] The present invention was conceived in view of the above problem, and aims to provide a bandwidth extension method capable of reducing the amount of computation in bandwidth extension as well as suppressing quality deterioration in the extended bandwidth.
[Solution to the Problem]
[0032] In order to achieve the aforementioned objective, the bandwidth extension method according to one aspect of the present invention is a bandwidth extension method for producing a full bandwidth signal from a low frequency bandwidth signal, the method including: a first transform step of transforming the low frequency bandwidth signal into a quadrature mirror filter bank (QMF) domain to generate a first low frequency QMF spectrum; a pitch shift step of generating pitch-shifted signals by applying different shift coefficients to the low frequency bandwidth signal; a high frequency generation step of generating a high frequency QMF spectrum by time-stretching the pitch-shifted signals in the QMF domain; a spectrum modification step of modifying the high frequency QMF spectrum to satisfy high frequency energy and tonality conditions; and a full bandwidth generation step of generating the full bandwidth signal by combining the modified high frequency QMF spectrum with the first low frequency QMF spectrum.
[0033] Accordingly, the high frequency QMF spectrum is generated by time-stretching the pitch-shifted signals in the QMF domain. It is therefore possible to avoid the conventional complex processing (successively repeated FFTs and IFFTs, and a subsequent QMF transform) for generating the high frequency QMF spectrum, and thus the amount of computation can be reduced. Note that, like the STFT, the QMF transform itself provides joint time-frequency resolution, so the QMF transform replaces the series of STFTs and ISTFTs. In addition, in the bandwidth extension method according to this aspect of the present invention, because the pitch-shifted signals are generated by applying mutually different shift coefficients rather than a single shift coefficient, and time stretching is performed on these signals, it is possible to suppress quality deterioration of the high frequency QMF spectrum.
[0034] Furthermore, the high frequency generation step includes: a second transform step of transforming the pitch-shifted signals into a QMF domain to generate QMF spectra; a harmonic patch generation step of stretching the QMF spectra along a temporal dimension with different stretch factors to generate harmonic patches; an alignment step of time-aligning the harmonic patches; and a summation step of summing the time-aligned harmonic patches.
[0035] Furthermore, the harmonic patch generation step includes: a calculation step of calculating the amplitude and phase of a QMF spectrum among the QMF spectra; a phase manipulation step of manipulating the phase to produce a new phase; and a QMF coefficient generation step of combining the amplitude with the new phase to generate a new set of QMF coefficients.
[0036] Furthermore, in the phase manipulation step, the new phase is produced based on an original phase of an entire set of QMF coefficients.
[0037] Furthermore, in the phase manipulation step, a manipulation is performed repeatedly for sets of QMF coefficients and, in the QMF coefficient generation step, new sets of QMF coefficients are generated.
[0038] Furthermore, in the phase manipulation step, a different manipulation is performed depending on the QMF subband index.
[0039] Furthermore, in the QMF coefficient generation step, the new sets of QMF coefficients are overlap-added to generate the QMF coefficients corresponding to a time-stretched audio signal.
[0040] Specifically, the time stretching in the bandwidth extension method according to this aspect of the present invention mimics the STFT-based stretching method by modifying the phases of the input QMF blocks and overlap-adding the modified QMF blocks with a different hop size. In terms of the amount of computation, compared with the successive FFTs and IFFTs of an STFT-based method, this time stretching requires less computation because it involves only one QMF analysis transform. Therefore, it is possible to further reduce the amount of computation in bandwidth extension.
[0041] Furthermore, in order to achieve the objective mentioned above, the bandwidth extension method according to another aspect of the present invention is a bandwidth extension method for producing a full bandwidth signal from a low frequency bandwidth signal, the method including: a first transform step of transforming the low frequency bandwidth signal into a quadrature mirror filter bank (QMF) domain to generate a first low frequency QMF spectrum; a low-order harmonic patch generation step of generating a low-order harmonic patch by time-stretching the low frequency bandwidth signal in the QMF domain; a high frequency generation step of (i) generating pitch-shifted signals by applying different shift coefficients to the low-order harmonic patch, and (ii) generating a high frequency QMF spectrum from the signals; a spectrum modification step of modifying the high frequency QMF spectrum to satisfy high frequency energy and tonality conditions; and a full bandwidth generation step of generating the full bandwidth signal by combining the modified high frequency QMF spectrum with the first low frequency QMF spectrum.
[0042] Accordingly, the high frequency QMF spectrum is generated by time-stretching and pitch-shifting the low frequency bandwidth signal in the QMF domain. It is therefore possible to avoid the conventional complex processing (successively repeated FFTs and IFFTs, and a subsequent QMF transform) for generating the high frequency QMF spectrum, and thus the amount of computation can be reduced. In addition, since the pitch-shifted signals are generated by applying mutually different shift coefficients instead of a single shift coefficient, and the high frequency QMF spectrum is generated from these signals, it is possible to suppress quality deterioration of the high frequency QMF spectrum. Furthermore, since the high frequency QMF spectrum is generated from the low-order harmonic patch, it is possible to further suppress quality deterioration of the high frequency QMF spectrum.
[0043] It should be noted that, in the bandwidth extension method according to this other aspect of the present invention, the pitch shifting also operates in the QMF domain. It decomposes each LF QMF subband of the low-order patch into multiple sub-subbands for higher frequency resolution, and then maps those sub-subbands onto high QMF subbands to generate the spectrum of the high-order patches.
[0044] Furthermore, the low-order harmonic patch generation step includes: a second transform step of transforming the low frequency bandwidth signal into a second low frequency QMF spectrum; a bandpass step of bandpassing the second low frequency QMF spectrum; and a stretching step of stretching the bandpassed second low frequency QMF spectrum along a temporal dimension.
[0045] Furthermore, the second low frequency QMF spectrum has a finer frequency resolution than the first low frequency QMF spectrum.
[0046] Furthermore, the high frequency generation step includes: a patch generation step of bandpassing the low-order harmonic patch to generate bandpassed patches; a high-order generation step of mapping each of the bandpassed patches into high frequency to generate high-order harmonic patches; and a summation step of summing the high-order harmonic patches with the low-order harmonic patch.
[0047] Furthermore, the high-order generation step includes: a split step of splitting each QMF subband in each of the bandpassed patches into multiple sub-subbands; a mapping step of mapping the sub-subbands to high frequency QMF subbands; and a combining step of combining the results of the sub-subband mapping.
[0048] Furthermore, the mapping step includes: a division step of dividing the sub-subbands of each QMF subband into a stopband part and a passband part; a computation step of computing transposed center frequencies of the sub-subbands in the passband part with a factor that depends on the patch order; a first mapping step of mapping the sub-subbands in the passband part to high frequency QMF subbands according to the center frequencies; and a second mapping step of mapping the sub-subbands in the stopband part to high frequency QMF subbands according to the sub-subbands of the passband part.
[0049] It should be noted that, in the bandwidth extension method according to the present invention, the process operations (steps) described above can be combined in any way.
[0050] The bandwidth extension method according to the present invention is a low-computation HBE technology that uses a computation-reduced HF spectrum generator, the HF spectrum generator being the largest contributor to the amount of computation in HBE. To reduce the amount of computation, the bandwidth extension method according to one aspect of the present invention uses a new QMF-based phase vocoder that performs time stretching in the QMF domain with a low amount of computation. Furthermore, in the bandwidth extension method according to another aspect of the present invention, to avoid possible quality problems associated with that solution, a new pitch shifting algorithm is used, which generates high-order harmonic patches from a low-order patch in the QMF domain.
[0051] It is an objective of this invention to design a QMF-based patch in which time stretching, or both time stretching and frequency extension, can be performed in the QMF domain, and furthermore to develop a low-computation HBE technology driven by a QMF-based phase vocoder.
[0052] It should be noted that the present invention can be realized not only as a bandwidth extension method, but also as a bandwidth extension apparatus and an integrated circuit that extend the frequency bandwidth of an audio signal using the bandwidth extension method, as a program that causes a computer to extend a frequency bandwidth using the bandwidth extension method, and as a recording medium on which the program is recorded.
[Advantageous Effect of the Invention]
[0053] The bandwidth extension method of the present invention provides a new harmonic bandwidth extension (HBE) technology. The core of the technology is to perform time stretching, or both time stretching and pitch shifting, in the QMF domain rather than in the traditional FFT domain and time domain, respectively. Compared with the prior-art HBE technology, the bandwidth extension method of the present invention can provide good sound quality while significantly reducing the amount of computation.
[Brief Description of Drawings]
[0054] [FIG. 1] Figure 1 is a diagram showing an audio codec scheme using normal BWE technology.
[0055] [FIG. 2] Figure 2 is a diagram showing an HF spectrum generator that preserves harmonic structure.
[0056] [FIG. 3A] Figure 3A is a diagram showing the principle of time stretching by re-spacing audio blocks.
[0057] [FIG. 3B] Figure 3B is a diagram showing the principle of time stretching by re-spacing audio blocks.
[0058] [FIG. 4] Figure 4 is a diagram showing a QMF analysis and synthesis scheme.
[0059] [FIG. 5] Figure 5 is a flowchart showing a method of extending bandwidth in a first embodiment of the present invention.
[0060] [FIG. 6] Figure 6 is a diagram showing an HF spectrum generator in the first embodiment of the present invention.
[0061] [FIG. 7] Figure 7 is a diagram showing an audio decoder in the first embodiment of the present invention.
[0062] [FIG. 8] Figure 8 is a diagram showing a scheme of changing the time scale of a signal based on a QMF transform in the first embodiment of the present invention.
[0063] [FIG. 9] Figure 9 is a diagram showing a QMF domain time stretching method in the first embodiment of the present invention.
[0064] [FIG. 10] Figure 10 is a diagram showing a comparison of the stretching effects for a sinusoidal tonal signal with different stretch factors.
[0065] [FIG. 11] Figure 11 is a diagram showing an energy misalignment and spreading effect in an HBE scheme.
[0066] [FIG. 12] Figure 12 is a flow chart showing the method of extending bandwidth in a second embodiment of the present invention.
[0067] [FIG. 13] Figure 13 is a diagram showing an HF spectrum generator in the second embodiment of the present invention.
[0068] [FIG. 14] Figure 14 is a diagram showing an audio decoder in the second embodiment of the present invention.
[0069] [FIG. 15] Figure 15 is a diagram showing a frequency extension method in the QMF domain in the second embodiment of the present invention.
[0070] [FIG. 16] Figure 16 is a figure showing a distribution of subband spectra in the second embodiment of the present invention.
[0071] [FIG. 17] Figure 17 is a diagram showing the relationship between the passband component and the stopband component for a sine wave in a complex QMF domain in the second embodiment of the present invention.
[Description of Embodiments]
[0072] The following embodiments are merely illustrative of the principles of various inventive steps. It is understood that variations of the details described here will be apparent to those skilled in the art.
(Embodiment 1)
[0073] Hereinafter, an HBE scheme (harmonic bandwidth extension method) and a decoder (audio decoder or audio decoding apparatus) using the same, according to the present invention, will be described.
[0074] Figure 5 is a flow chart showing the bandwidth extension method in the present embodiment.
[0075] This bandwidth extension method is a bandwidth extension method for producing a full bandwidth signal from a low frequency bandwidth signal, the method including: a first transform step of transforming the low frequency bandwidth signal into a quadrature mirror filter bank (QMF) domain to generate a first low frequency QMF spectrum; a pitch shift step of generating pitch-shifted signals by applying different shift coefficients; a high frequency generation step of generating a high frequency QMF spectrum by time-stretching the pitch-shifted signals in the QMF domain; a spectrum modification step of modifying the high frequency QMF spectrum to satisfy high frequency energy and tonality conditions; and a full bandwidth generation step of generating the full bandwidth signal by combining the modified high frequency QMF spectrum with the first low frequency QMF spectrum.
[0076] It should be noted that the first transform step (S11) is performed by a T-F transform unit 1406 to be described later, and the pitch shift step (S12) is performed by sampling units 504 to 506 and a time resampling unit 1403 to be described later. In addition, the high frequency generation step (S13) is performed by QMF transform units 507 to 509, phase vocoders 510 to 512, a QMF transform unit 1404, and a time stretching unit 1405 to be described later. Furthermore, the full bandwidth generation step (S15) is performed by an addition unit 1410 to be described later.
[0077] Furthermore, the high frequency generation step includes: a second transform step of transforming the pitch-shifted signals into a QMF domain to generate QMF spectra; a harmonic patch generation step of stretching the QMF spectra along a temporal dimension with different stretch factors to generate harmonic patches; an alignment step of time-aligning the harmonic patches; and a summation step of summing all of the time-aligned harmonic patches.
[0078] It should be noted that the second transform step is performed by the QMF transform units 507 to 509 and the QMF transform unit 1404, and the harmonic patch generation step is performed by the phase vocoders 510 to 512 and the time stretching unit 1405. Furthermore, the alignment step is performed by delay alignment units 513 to 515 to be described later, and the summation step is performed by an addition unit 516 to be described later.
[0079] In the HBE scheme of the present embodiment, the HF spectrum generator of the HBE technology is designed with pitch shifting processes in the time domain, followed by vocoder-driven time stretching processes in the QMF domain.
[0080] Figure 6 is a diagram showing the HF spectrum generator used in the HBE scheme in the present embodiment. The HF spectrum generator includes: bandpass units 501, 502, ..., and 503; sampling units 504, 505, ..., and 506; QMF transform units 507, 508, ..., and 509; phase vocoders 510, 511, ..., and 512; delay alignment units 513, 514, ..., and 515; and the addition unit 516.
[0081] A given LF bandwidth input is first bandpassed (501 to 503) and resampled (504 to 506) to generate its HF bandwidth parts. Those HF bandwidth parts are transformed (507 to 509) into the QMF domain, and the resulting QMF outputs are time-stretched (510 to 512) with stretch factors that are twice the corresponding resampling factors. The stretched HF spectra are delay-aligned (513 to 515) to compensate for the potentially different delay contributions of the resampling process, and summed (516) to generate the final HF spectrum. It should be noted that each of the numbers 501 to 516 in parentheses above denotes a constituent element of the HF spectrum generator.
[0082] Comparing the scheme of the present embodiment with the prior-art scheme (figure 2), the main differences are: 1) more QMF transforms are applied; and 2) the time stretching operation is performed in the QMF domain, not in the FFT domain. The QMF time stretching operation will be described in detail later.
[0083] Figure 7 is a diagram showing a decoder that adopts the HF spectrum generator of the present embodiment. The decoder (audio decoding apparatus) includes a demultiplexing unit 1401, a decoding unit 1402, the time resampling unit 1403, the QMF transform unit 1404 and the time stretching unit 1405. It should be noted that, in the present embodiment, the demultiplexing unit 1401 corresponds to the separation unit that separates an encoded low frequency bandwidth signal from the encoded information (bit stream). Furthermore, the T-F transform unit 1409 corresponds to the inverse transform unit that transforms the full bandwidth signal from a quadrature mirror filter bank (QMF) domain signal into a time domain signal.
[0084] In the decoder, the bit stream is first demultiplexed (1401), and the LF part of the signal is then decoded (1402). To approximate the original HF part, the decoded LF part (low frequency bandwidth signal) is resampled (1403) in the time domain to generate an HF part, the resulting HF part is transformed (1404) into the QMF domain, the resulting HF QMF spectrum is stretched (1405) along the temporal direction, and the stretched HF spectrum is further refined (1408) by post-processing under the guidance of some of the decoded HF parameters. Meanwhile, the decoded LF part is also transformed (1406) into the QMF domain. Finally, the refined HF spectrum is combined (1410) with the delayed (1407) LF spectrum to produce the full bandwidth QMF spectrum. The resulting full bandwidth QMF spectrum is converted (1409) back into the time domain to output the decoded wideband audio signal. It should be noted that each of the numbers 1401 to 1410 in parentheses above denotes a constituent element of the decoder.
The Time Stretching Method
[0085] In the time stretching process of the HBE scheme in the present embodiment, the time-stretched version of an audio signal can be generated by, for example, a QMF transform, phase manipulations and an inverse QMF transform. Specifically, the harmonic patch generation step includes: a calculation step of calculating the amplitude and phase of a QMF spectrum among the QMF spectra; a phase manipulation step of manipulating the phase to produce a new phase; and a QMF coefficient generation step of combining the amplitude with the new phase to generate a new set of QMF coefficients. It should be noted that each of the calculation step, the phase manipulation step, and the QMF coefficient generation step is performed by a module 702 to be described later.
[0086] Figure 8 is a diagram showing a QMF-based time stretching process performed by the QMF transform unit 1404 and the time stretching unit 1405. First, an audio signal is transformed into a set of QMF coefficients, say X(m, n), by a QMF analysis transform (701). These QMF coefficients are modified in module 702. For each QMF coefficient, its amplitude r and phase a are calculated, that is, X(m, n) = r(m, n) · exp(j · a(m, n)). The phases a(m, n) are modified (manipulated) to ã(m, n). The new set of QMF coefficients, formed from the modified phases ã and the original amplitudes r, is shown in (Equation 3) below.
$$\tilde{X}(m, n) = r(m, n) \cdot \exp\bigl(j \cdot \tilde{a}(m, n)\bigr) \qquad \text{(Equation 3)}$$
[0087] Finally, a new set of QMF coefficients is transformed (703) into a new audio signal, corresponding to the original audio signal with a modified time scale.
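The amplitude and phase handling in module 702 can be sketched as follows. This is a minimal illustration in Python with NumPy, not the implementation of the embodiment: the function name `recombine`, the example coefficients and the constant phase offset of 0.3 rad are hypothetical, and the actual phase manipulation rule is the block-based one described in the paragraphs that follow.

```python
import numpy as np

def recombine(X, new_phase):
    """Keep the amplitude of each QMF coefficient and replace its phase."""
    r = np.abs(X)                       # amplitude r(m, n)
    return r * np.exp(1j * new_phase)   # r(m, n) * exp(j * new phase)

# Hypothetical 2 x 2 set of complex QMF coefficients X(m, n).
X = np.array([[1 + 1j, -2j], [0.5 + 0j, -1 + 0j]])
a = np.angle(X)                         # original phase a(m, n)

# Illustrative manipulation only: add a constant offset to every phase.
X_new = recombine(X, a + 0.3)
```

Because only the phase is replaced, the amplitudes of `X_new` equal those of `X`, and passing the unmodified phase `a` returns the original coefficients.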
[0088] The QMF-based time stretching algorithm in the HBE scheme in the present embodiment mimics the STFT-based stretching algorithm: 1) the modification stage uses the concept of instantaneous frequency to modify the phases; 2) to reduce the amount of computation, the overlap-add is performed in the QMF domain using the additivity property of the QMF transform.
[0089] Below is the detailed description of the time stretching algorithm in the HBE scheme in the present embodiment.
[0090] Assume that there are 2L real-valued time domain samples x(n) to be stretched with a stretching factor s. After the QMF analysis stage, there are 2L complex QMF coefficients, consisting of 2L/M time slots and M sub-bands.
[0091] Note that, as in the STFT-based stretching method, the transformed QMF coefficients are optionally subjected to analysis windowing before phase manipulation. In this invention, this can be done either in the time domain or in the QMF domain.
[0092] In the time domain, a time domain signal can naturally be windowed as in (Equation 4) below.
(Equation 4)
[0093] The mod(.) in (Equation 4) denotes a modulo operation.
[0094] In the QMF domain, the equivalent operation can be performed by:
[0095] 1) Transforming the analysis window h(n) (of length L) into the QMF domain to produce H(v, k) with L/M time slots and M sub-bands.
[0096] 2) Simplifying the QMF representation of the window as shown in (Equation 5) below.
(Equation 5)
[0097] Here, v = 0, ..., L/M-1.
[0098] 3) Performing the analysis windowing in the QMF domain by X(m, k) = X(m, k) · H0(w), where w = mod(m, L/M) (it should be noted that mod(.) denotes a modulo operation).
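A minimal sketch of step 3) follows, assuming the simplified window H0 is already available as an array of L/M values. Since (Equation 5) is not reproduced in this text, the Hann-shaped `H0` below is a hypothetical stand-in, and `window_qmf` is an illustrative name only.

```python
import numpy as np

def window_qmf(X, H0):
    """Apply X(m, k) <- X(m, k) * H0(w), with w = mod(m, L/M).

    X  : complex array of shape (time_slots, sub_bands)
    H0 : real array holding L/M window values
    """
    slots = X.shape[0]
    w = np.mod(np.arange(slots), len(H0))   # window index for each time slot
    return X * H0[w][:, None]               # same weight for every sub-band

L_over_M = 4                                # hypothetical block length in slots
H0 = np.hanning(L_over_M + 2)[1:-1]         # stand-in window, NOT (Equation 5)
X = np.ones((2 * L_over_M, 3), dtype=complex)
Xw = window_qmf(X, H0)
```

The modulo makes the window repeat every L/M time slots, so the first and second halves of `Xw` receive identical weights in this example.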
[0099] Furthermore, in the HBE scheme in the present embodiment, in the phase manipulation step, the new phase is produced based on the original phases of an entire set of QMF coefficients. Specifically, in the present embodiment, as a detailed implementation of time stretching, phase manipulation is performed on a QMF block basis.
[00100] Figure 9 is a diagram of a time stretching method in the QMF domain.
[00101] These original QMF coefficients can be treated as L + 1 overlapping QMF blocks with a hop size of 1 time slot and a block length of L/M time slots, as shown in (a) in figure 9.
[00102] To ensure that there is no phase jump effect, each original QMF block is modified to generate a new QMF block with modified phases, and the phases of the new QMF blocks must be continuous at the point p·s where the p-th and (p+1)-th new QMF blocks overlap, which is equivalent to continuity at the joint points p·M·s (p ∈ N) in the time domain.
[00103] Furthermore, in the HBE scheme in the present embodiment, in the phase manipulation step, a manipulation is performed repeatedly for sets of QMF coefficients, and in the QMF coefficient generation step, new sets of QMF coefficients are generated. In this case, the phases are modified block by block following the criteria above.
[00104] Assume that the original phases are φ_u(k) for the QMF coefficients X(u, k), for u = 0, ..., 2L/M-1 and k = 0, ..., M-1. Each QMF block is sequentially modified to a new QMF block, as shown in (b) in Figure 9, where new QMF blocks are illustrated with different fill patterns.
[00105] Next, ψ_u^(n)(k) represents the phase information of the n-th new QMF block for n = 1, ..., L/M, u = 0, ..., L/M-1 and k = 0, 1, ..., M-1. These new phases, depending on whether the new block is repositioned or not, are designed as follows.
[00106] Assume that the 1st new QMF block X^(1)(u, k) (u = 0, ..., L/M-1) is not repositioned. Then, the new phase information ψ_u^(1)(k) is identical to φ_u(k). That is, ψ_u^(1)(k) = φ_u(k) for u = 0, ..., L/M-1 and k = 0, 1, ..., M-1.
[00107] The 2nd new QMF block X^(2)(u, k) (u = 0, ..., L/M-1) is repositioned with a hop size of s time slots (for example, 2 time slots, as shown in figure 9). In this case, the instantaneous frequencies at the beginning of the block must be consistent with those in the s-th time slot of the new QMF block X^(1)(u, k). Thus, the instantaneous frequencies for the first time slot of X^(2)(u, k) must be identical to those for the 2nd time slot in the original QMF block.

[00108] Furthermore, since the phases for the 1st time slot are changed, the remaining phases are adjusted accordingly to preserve the original instantaneous frequencies. That is, ψ_u^(2)(k) = ψ_{u-1}^(2)(k) + Δφ_{u+1}(k) for u = 1, ..., L/M-1, where Δφ_u(k) = φ_u(k) - φ_{u-1}(k) represents the original instantaneous frequencies for the original QMF block.
[00109] For successive synthesis blocks, the same phase modification rules are applied. That is, for the m-th new QMF block (m = 3, ..., L/M), its phases ψ_u^(m)(k) are decided as shown below.

[00110] Combined with the amplitude information of the original blocks, the new phases above yield L/M new blocks.
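The rule of changing the phase of the first time slot while keeping the original phase increments can be sketched as below. The function `propagate_phases` and its arguments are hypothetical names, and the starting phase `psi_start` is simply taken as given, because the expression that fixes it for each block is not reproduced in this text.

```python
import numpy as np

def propagate_phases(phi, psi_start):
    """Rebuild the phases of one block from a new starting phase.

    phi       : original phases of the block, shape (time_slots, sub_bands)
    psi_start : new phase for the first time slot, shape (sub_bands,)
    The original slot-to-slot phase differences are preserved.
    """
    phi = np.asarray(phi, dtype=float)
    dphi = np.diff(phi, axis=0)                  # original phase increments
    psi = np.empty_like(phi)
    psi[0] = psi_start                           # changed first-slot phase
    psi[1:] = psi[0] + np.cumsum(dphi, axis=0)   # psi_u = psi_{u-1} + dphi
    return psi

rng = np.random.default_rng(0)
phi = rng.uniform(-np.pi, np.pi, size=(8, 4))    # hypothetical original phases
psi = propagate_phases(phi, phi[0] + 0.7)        # 0.7 rad offset is illustrative
```

If `psi_start` equals the original first-slot phase, the block is returned unchanged, which matches the case of the 1st new QMF block.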
[00111] Here, in the HBE scheme in the present embodiment, in the phase manipulation step, a different manipulation is performed depending on the QMF sub-band index. Specifically, the above phase modification method can be designed differently for odd QMF sub-bands and even QMF sub-bands.
[00112] This is based on the fact that, for a tonal signal, its instantaneous frequency in the QMF domain is associated with the phase difference, Δφ(n, k) = φ(n, k) - φ(n-1, k), in different ways.
[00113] In more detail, the instantaneous frequency ω(n, k) can be determined using (Equation 6) below.
(Equation 6)
[00114] In (Equation 6), princarg(a) means the principal angle of a, defined by (Equation 7) below.
(Equation 7)
[00115] In the equation, mod(a, b) denotes a modulo b.
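For illustration, one common way to compute a principal angle with a modulo operation is sketched below. It wraps any angle into the interval [-π, π); this convention is an assumption and may differ from (Equation 7) in how the end points of the interval are treated.

```python
import numpy as np

def princarg(a):
    """Wrap an angle (in radians) into the interval [-pi, pi)."""
    return np.mod(a + np.pi, 2.0 * np.pi) - np.pi

# Angles that differ by whole turns map to the same principal angle.
wrapped = princarg(np.array([0.1, 0.1 + 2.0 * np.pi, 0.1 - 4.0 * np.pi]))
```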
[00116] As a result, for example, in the phase modification method above, the phase difference can be written as in (Equation 8) below.
(Equation 8)
[00117] Furthermore, in the HBE scheme in the present embodiment, in the QMF coefficient generation step, the new sets of QMF coefficients are overlap-added to generate the QMF coefficients corresponding to a time-stretched audio signal. Specifically, in order to reduce the amount of computation, the QMF synthesis operation is not applied directly to each new individual QMF block. Instead, it is applied to the overlap-added result of those new QMF blocks.
[00118] Note that, as in the STFT-based stretching method, the new QMF coefficients are optionally subjected to synthesis windowing prior to the overlap-add. In the present embodiment, like the analysis windowing, the synthesis windowing can be performed as shown below.
X^(n+1)(u, k) = X^(n+1)(u, k) · H0(w), where w = mod(u, L/M)
[00119] Then, due to the additivity of the QMF transform, all L/M new blocks can be overlap-added, with a hop size of s time slots, before the QMF synthesis. The overlap-added result Y(u, k) can be obtained through the equation below.
(Equation 9)
[00120] Here, n = 0, ..., L/M-1, u = 1, ..., L/M, and k = 0, ..., M-1.
[00121] The final audio signal, which corresponds to the original signal with a modified time scale, can be generated by applying QMF synthesis to Y(u, k).
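The overlap-add of the new QMF blocks with a hop size of s time slots can be sketched as follows. This is a generic overlap-add written under the assumption that all blocks have the same length; the function name `overlap_add` and the array layout are illustrative, and the exact indexing of (Equation 9) is not reproduced here.

```python
import numpy as np

def overlap_add(blocks, hop):
    """Overlap-add QMF blocks along the time-slot axis.

    blocks : array of shape (num_blocks, block_len, sub_bands)
    hop    : hop size s, in time slots
    """
    num_blocks, block_len, sub_bands = blocks.shape
    total = (num_blocks - 1) * hop + block_len
    Y = np.zeros((total, sub_bands), dtype=blocks.dtype)
    for i, block in enumerate(blocks):
        start = i * hop                  # block i begins i * s slots later
        Y[start:start + block_len] += block
    return Y

# Three hypothetical blocks of 4 time slots and 2 sub-bands, hop s = 2.
blocks = np.ones((3, 4, 2), dtype=complex)
Y = overlap_add(blocks, hop=2)
```

A single QMF synthesis applied to `Y` then replaces one synthesis per block, which is where the saving in computation comes from.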
[00122] Comparing the QMF-based stretching method in the HBE scheme in the present embodiment with the prior art STFT-based stretching method, it is worth noting that the inherent time resolution of the QMF transform helps to significantly reduce the amount of computation, because the same time resolution can be obtained only with a series of STFT transforms in the prior art STFT-based stretching method.
[00123] The analysis below gives an approximate comparison that considers only the amount of computation contributed by the transforms.
[00124] Assuming that the amount of computation of an STFT of size L is log2(L)·L and that the amount of computation of a QMF analysis transform is approximately twice that of an FFT transform, the amount of transform computation involved in the prior art HF spectrum generator is approximately as shown below.
(Equation 10)
[00125] By comparison, the amount of transform computation involved in the HF spectrum generator in the present embodiment is approximately as shown in (Equation 11) below:
(Equation 11)
[00126] For example, assuming L = 1024 and Ra = 128, the comparison of the amount of computation above is given in Table 1.
[Table 1]
Table 1. Comparison of the amount of computation between the prior art HBE and the proposed HBE adopting QMF-based time stretching in the present embodiment
(Second Embodiment)
[00127] Hereinafter, a second embodiment of the HBE scheme (harmonic bandwidth extension method) and a decoder (audio decoder or audio decoding device) using the same shall be described in detail.
[00128] Note that, with the adoption of the QMF-based time stretching method, the HBE technology has a much lower amount of computation. On the other hand, adopting the QMF-based time stretching method also poses two possible problems, which risk degrading the sound quality.
[00129] First, there is a problem of quality degradation for a high-order patch. Assume that an HF spectrum is composed of (T-1) patches with corresponding stretching factors 2, 3, ..., T. Because the QMF-based time stretching is block based, the reduced number of overlap-add operations in a high-order patch degrades the stretching effect.
[00130] Figure 10 is a diagram showing a sinusoidal tonal signal. The upper panel (a) shows the stretching effect of a 2nd-order patch for a pure sinusoidal tonal signal: the stretched output is basically clean, with only a few other frequency components present at small amplitudes. Meanwhile, the lower panel (b) shows the stretching effect of a 4th-order patch for the same sinusoidal tonal signal.
[00131] Comparing with (a), it can be seen that, although the center frequency is shifted correctly in (b), the resulting output also includes some other frequency components with non-negligible amplitude. This can result in unwanted noise in the stretched output.
[00132] Second, there is a possible problem of quality degradation for transient signals. Such a quality degradation problem has 3 potential contributing sources.
[00133] The first contributing source is that the transient component may be lost during resampling. Assuming a transient signal with a Dirac pulse located at an even sample, for the 4th-order patch with decimation by a factor of 2, such a Dirac pulse disappears in the resampled signal. As a result, the resulting HF spectrum has incomplete transient components.
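The loss of a Dirac pulse under decimation by 2 can be reproduced with a few lines. The sketch below assumes plain sample dropping without any anti-aliasing filter and a decimation phase that keeps the 1st, 3rd, 5th, ... samples; both are assumptions made for illustration, and `decimate_by_2` is a hypothetical helper.

```python
import numpy as np

def decimate_by_2(x):
    """Keep every second sample, starting from the first one."""
    return x[::2]

# Pulse at zero-based index 3, i.e. the 4th sample (an even position
# when counting from 1): it falls between the retained samples.
x_lost = np.zeros(16)
x_lost[3] = 1.0
y_lost = decimate_by_2(x_lost)

# Pulse at zero-based index 4, i.e. the 5th sample: it is retained.
x_kept = np.zeros(16)
x_kept[4] = 1.0
y_kept = decimate_by_2(x_kept)
```

Here `y_lost` contains only zeros while `y_kept` still carries the pulse, so whether a transient survives depends on its position relative to the decimation grid.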
[00134] The second contributing source is the misalignment of transient components among different patches. Because the patches have different resampling factors, a Dirac pulse located at a specific position can have several components located at different time slots in the QMF domain.
[00135] Figure 11 is a diagram showing a misalignment and an energy spreading effect. For an input with a Dirac pulse (for example, in figure 11, at the 3rd sample, shown in gray), after resampling with different factors, its position is changed to different positions. As a result, the stretched output shows a perceptually attenuated transient effect.
[00136] The third contributing source is that the energies of transient components are spread non-uniformly among different patches. As shown in figure 11, with the 2nd-order patch, the associated transient component is spread over the 5th and 6th samples; with the 3rd-order patch, over the 4th to 6th samples; and with the 4th-order patch, over the 5th to 8th samples. As a result, the stretched output has a weaker transient effect at the higher frequency. For some critical transient signals, the stretched output even shows some annoying pre- and post-echo artifacts.
[00137] In order to overcome the above quality degradation problem, an improved HBE technology is desired. However, a solution that is too complicated also increases the amount of computation. In the present embodiment, a QMF-based pitch shifting method is used to avoid the possible quality degradation problem while keeping the advantage of a low amount of computation.
[00138] As described in detail below, in the HBE scheme (harmonic bandwidth extension method) in the present embodiment, the HF spectrum generator is designed with both a time stretching process and a pitch shifting process in the QMF domain. Furthermore, a decoder (audio decoder or audio decoding device) using the HBE in the present embodiment shall also be described below.
[00139] Figure 12 is a flow chart showing the bandwidth extension method in the present embodiment.
[00140] This bandwidth extension method is a bandwidth extension method for producing a full bandwidth signal from a low frequency bandwidth signal, the method including: a first transform step of transforming the low frequency bandwidth signal into a quadrature mirror filter bank (QMF) domain to generate a first low frequency QMF spectrum; a low-order harmonic patch generation step of generating a low-order harmonic patch by time stretching the low frequency bandwidth signal in the QMF domain; a high frequency generation step of (i) generating pitch-shifted signals by applying different shift coefficients to the low-order harmonic patch, and (ii) generating a high frequency QMF spectrum from the signals; a spectrum modification step of modifying the high frequency QMF spectrum to satisfy high frequency energy and tonality conditions; and a full bandwidth generation step of generating the full bandwidth signal by combining the modified high frequency QMF spectrum with the first low frequency QMF spectrum.
[00141] It should be noted that the first transform step is performed by a T-F transform unit 1508 to be described later, and the low-order harmonic patch generation step is performed by a QMF transform unit 1503, a time stretching unit 1504, a QMF transform unit 601, and a phase vocoder 603 to be described later. In addition, the high frequency generation step is performed by a pitch shift unit 1506, bandpass units 604 and 605, frequency extension units 606 and 607, and delay alignment units 608 to 610, to be described later. Furthermore, the spectrum modification step is performed by an HF post-processing unit 1507 to be described later, and the full bandwidth generation step is performed by an addition unit 1512.
[00142] Furthermore, the low-order harmonic patch generation step includes: a second transform step of transforming the low frequency bandwidth signal into a second low frequency QMF spectrum; a bandpass step of bandpassing the second low frequency QMF spectrum; and a stretching step of stretching the bandpassed second low frequency QMF spectrum along the temporal dimension.
[00143] It should be noted that the second transform step is performed by the QMF transform unit 601 and the QMF transform unit 1503, the bandpass step is performed by a bandpass unit 602 to be described later, and the stretching step is performed by the phase vocoder 603 and the time stretching unit 1504.
[00144] Furthermore, the second low frequency QMF spectrum has a finer frequency resolution than the first low frequency QMF spectrum.
[00145] Furthermore, the high frequency generation step includes: a patch generation step of bandpassing the low-order harmonic patch to generate bandpassed patches; a high-order generation step of mapping each of the bandpassed patches to a high frequency to generate high-order harmonic patches; and a summing step of summing the high-order harmonic patches with the low-order harmonic patch.
[00146] It should be noted that the patch generation step is performed by the bandpass units 604 and 605, the high-order generation step is performed by the frequency extension units 606 and 607, and the summing step is performed by the addition unit 611 to be described later.
[00147] Figure 13 is a diagram showing the HF spectrum generator in the HBE scheme in the present embodiment. The HF spectrum generator includes the QMF transform unit 601, the bandpass units 602, 604, ..., and 605, the phase vocoder 603, the frequency extension units 606, ..., and 607, the delay alignment units 608, 609, ..., and 610, and the addition unit 611.
[00148] An LF bandwidth input is first transformed (601) into the QMF domain, and its bandpassed (602) QMF spectrum is stretched in time (603) to double the length. The stretched QMF spectrum is bandpassed (604 to 605) to produce (T-2) band-limited spectra. The resulting band-limited spectra are translated (606 to 607) into higher frequency bandwidth spectra. These HF spectra are delay-aligned (608 to 610) to compensate for the potentially different delays contributed by the spectrum translation process, and summed (611) to generate the final HF spectrum. It should be noted that each of the numbers 601 to 611 in parentheses above denotes a constituent element of the HF spectrum generator.
[00149] Note that, compared with the QMF transform (108 in figure 1), the QMF transform in the HBE scheme in the present embodiment (QMF transform unit 601) has a finer frequency resolution; the resulting decrease in time resolution is compensated by the subsequent stretching operation.
[00150] Comparing the HBE scheme in the present embodiment with the prior art scheme (figure 2), the main differences are: 1) as in the first embodiment, the time stretching process is conducted in the QMF domain, not in the FFT domain; 2) higher-order patches are generated based on the 2nd-order patch; 3) the pitch shifting process is also conducted in the QMF domain, not in the time domain.
[00151] Figure 14 is a diagram showing the decoder adopting the HF spectrum generator in the HBE scheme in the present embodiment. The decoder (audio decoding apparatus) includes a demultiplexing unit 1501, a decoding unit 1502, the QMF transform unit 1503, the time stretching unit 1504, a delay alignment unit 1505, the pitch shift unit 1506, the HF post-processing unit 1507, the T-F transform unit 1508, a delay alignment unit 1509, an inverse T-F transform unit 1510 and an addition unit 1511. It should be noted that, in the present embodiment, the demultiplexing unit 1501 corresponds to the separation unit which separates an encoded low frequency bandwidth signal from the encoded information (bit stream). Furthermore, the inverse T-F transform unit 1510 corresponds to the inverse transform unit which transforms a full bandwidth signal from a quadrature mirror filter bank (QMF) domain signal to a time domain signal.
[00152] In the decoder, the bit stream is first demultiplexed (1501), and the LF part of the signal is then decoded (1502). To approximate the original HF part, the decoded LF part (low frequency bandwidth signal) is transformed (1503) into the QMF domain to generate the LF QMF spectrum. The resulting LF QMF spectrum is stretched (1504) along the temporal direction to generate a low-order HF patch. The low-order HF patch is pitch shifted (1506) to generate higher-order patches. The resulting high-order patches are combined with the delayed (1505) low-order HF patch to generate the HF spectrum, and the HF spectrum is further refined (1507) by post-processing, under the guidance of some decoded HF parameters. Meanwhile, the decoded LF part is also transformed (1508) into the QMF domain. Finally, the refined HF spectrum is combined (1512) with the delayed (1509) LF spectrum to produce a full bandwidth QMF spectrum. The resulting full bandwidth QMF spectrum is converted (1510) back to the time domain to output the decoded broadband audio signal. It should be noted that each of the numbers 1501 to 1512 denotes a constituent element of the decoder.
The Pitch Shifting Method
[00153] A QMF-based pitch shifting algorithm (frequency extension method in the QMF domain) for the pitch shift unit 1506 in the HBE scheme in the present embodiment is designed by decomposing the LF QMF sub-bands into plural sub-sub-bands, transposing those sub-sub-bands into HF sub-bands, and combining the resulting HF sub-bands to generate an HF spectrum. Specifically, the high-order generation step includes: a division step of dividing each QMF sub-band in each of the bandpassed patches into multiple sub-sub-bands; a mapping step of mapping the sub-sub-bands to high-frequency QMF sub-bands; and a combining step of combining the results of the sub-sub-band mapping.
[00154] It should be noted that the division step corresponds to step 1 (901 to 903) to be described later, the mapping step corresponds to steps 2 and 3 (904 to 909) to be described later, and the combining step corresponds to step 4 (910) to be described later.
[00155] Figure 15 is a diagram showing a QMF-based pitch shifting algorithm. Given a bandpassed spectrum of the 2nd-order patch, the HF spectrum of a t-th order patch (t > 2) can be reconstructed by: 1) decomposing (step 1: 901 to 903) the given LF spectrum, that is, decomposing each QMF sub-band within the LF spectrum into multiple QMF sub-sub-bands; 2) scaling (step 2: 904 to 906) the center frequencies of those sub-sub-bands by a factor of t/2; 3) mapping (step 3: 907 to 909) those sub-sub-bands to HF sub-bands; 4) summing all mapped sub-sub-bands to form the HF sub-bands (step 4: 910).
[00156] For step 1, several methods are available for decomposing a QMF sub-band into multiple sub-sub-bands in order to obtain a better frequency resolution, for example, the so-called M-band filters adopted in the MPEG Surround codec. In this preferred embodiment of the invention, the sub-band decomposition is performed by applying an additional set of exponentially modulated filter banks, defined by (Equation 12) below.
(Equation 12)
[00157] Here, q = -Q, -Q+1, ..., 0, 1, ..., Q-1 and n = 0, 1, ..., N (where n0 is an integer constant and N is the order of the filter bank).
[00158] By adopting the above filter bank, a given sub-band signal, say the k-th sub-band signal x(n, k), is decomposed into 2Q sub-sub-band signals according to (Equation 13) below.

[00159] Here, q = -Q, -Q+1, ..., 0, 1, ..., Q-1. In the equation, 'conv(.)' denotes the convolution function.
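The decomposition of one sub-band signal into 2Q sub-sub-band signals by convolution can be sketched as below. Because (Equation 12) and (Equation 13) are not reproduced in this text, the filter form g_q(n) = exp(j·π/Q·(q + 0.5)·(n − n0)) with a rectangular prototype is an assumption made for illustration only, as are the function name `split_subband`, the test signal and the parameter values; no downsampling is applied in this sketch.

```python
import numpy as np

def split_subband(x, Q, N, n0=0):
    """Split one complex sub-band signal into 2Q sub-sub-band signals.

    ASSUMED filter: g_q(n) = exp(j * pi / Q * (q + 0.5) * (n - n0)),
    n = 0..N, q = -Q..Q-1 (rectangular prototype, illustration only).
    """
    n = np.arange(N + 1)
    outputs = []
    for q in range(-Q, Q):
        g_q = np.exp(1j * np.pi / Q * (q + 0.5) * (n - n0))
        outputs.append(np.convolve(x, g_q))    # conv(x(n, k), g_q(n))
    return np.array(outputs)

# Hypothetical sub-band signal: a single complex exponential.
x = np.exp(1j * 0.2 * np.arange(32))
parts = split_subband(x, Q=2, N=7)
```

Each row of `parts` is one sub-sub-band signal; a practical filter bank would use a designed prototype filter and would normally be followed by downsampling.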
[00160] With such an additional complex transform, the frequency spectrum of a sub-band is further divided into 2Q sub-frequency spectra. From the frequency resolution point of view, if the QMF transform has M bands, its associated sub-band frequency resolution is π/M, and its sub-sub-band frequency resolution is refined to π/(2Q·M). In addition, the overall system shown in (Equation 14) is time-invariant, that is, free of aliasing, despite the use of downsampling and upsampling.
(Equation 14)
[00161] Note that the additional filter bank above is odd-stacked (the factor q + 0.5), which means that there is no sub-sub-band centered around DC. Instead, for an even number Q, the center frequencies of the sub-sub-bands are symmetric around zero.
[00162] Figure 16 is a graph showing a distribution of sub-sub-band spectra. Specifically, figure 16 shows such a filter bank spectrum distribution for the case of Q = 6. The purpose of the odd stacking is to facilitate the subsequent sub-sub-band combination.
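The effect of the factor q + 0.5 can be checked numerically. In the sketch below the center frequencies are expressed as (q + 0.5)/Q in units of π, which is an assumed normalization chosen for illustration; `odd_stacked_centres` is a hypothetical helper.

```python
import numpy as np

def odd_stacked_centres(Q):
    """Normalized centre frequencies (q + 0.5) / Q for q = -Q..Q-1."""
    q = np.arange(-Q, Q)
    return (q + 0.5) / Q

centres = odd_stacked_centres(6)   # Q = 6 as in figure 16
```

For Q = 6 this yields 12 center frequencies, none of which is zero, and each positive value has a mirror image of opposite sign.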
[00163] For step 2, the scaling of the frequencies can be simplified by considering the oversampling characteristics of the complex QMF transform.
[00164] Note that, in the complex QMF domain, as the passbands of adjacent sub-bands overlap each other, a frequency component in the overlap zone appears in both sub-bands (see International Patent Application Publication No. WO 2006048814).
[00165] As a result, the amount of computation for frequency scaling can be halved by calculating the frequencies only for those sub-sub-bands residing in the passband, that is, the positive frequency part for an even sub-band or the negative frequency part for an odd sub-band.
[00166] In more detail, the kLF-th sub-band is divided into 2Q sub-sub-bands. In other words, x(n, kLF) is divided as shown in (Equation 15) below.
(Equation 15)
[00167] Subsequently, in order to produce the t-th order patch, the center frequencies of those sub-sub-bands are scaled using (Equation 16) below.
(Equation 16)
[00168] Here, q = -Q, -Q+1, ..., -1 when kLF is odd, or q = 0, 1, ..., Q-1 when kLF is even.
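The selection of the passband sub-sub-band indices q by the parity of kLF can be written as a small helper. The function name `passband_indices` is hypothetical; the index ranges are those stated in the preceding paragraph.

```python
def passband_indices(k_lf, Q):
    """Sub-sub-band indices q lying in the passband of sub-band k_lf.

    Odd k_lf  -> negative-frequency part: q = -Q, ..., -1
    Even k_lf -> positive-frequency part: q = 0, ..., Q-1
    """
    if k_lf % 2 == 1:
        return list(range(-Q, 0))
    return list(range(0, Q))

odd_case = passband_indices(3, 2)    # [-2, -1]
even_case = passband_indices(4, 2)   # [0, 1]
```

Only Q of the 2Q sub-sub-bands are processed for each sub-band, which corresponds to the halving of the frequency scaling computation.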
[00169] For step 3, the mapping of the sub-sub-bands to an HF sub-band also needs to take into account the characteristics of the complex QMF transform. In the present embodiment, this mapping process is carried out in two stages: the first directly maps all the sub-sub-bands in the passband to an HF sub-band; the second, based on the result of the first, maps all the sub-sub-bands in the stop band to the HF sub-band. Specifically, the mapping step includes: a step of dividing the sub-sub-bands of each QMF sub-band into a stop band part and a passband part; a computation step of computing the transposed center frequencies of the sub-sub-bands in the passband part with a factor dependent on the patch order; a first mapping step of mapping the sub-sub-bands in the passband part to high-frequency QMF sub-bands according to the center frequencies; and a second mapping step of mapping the sub-sub-bands in the stop band part to high-frequency QMF sub-bands according to the mapping of the sub-sub-bands in the passband part.
[00170] To understand this point, it is helpful to review the relationship between a pair of positive and negative frequencies of the same signal component and their associated sub-band indices.
[00171] As mentioned earlier, in the complex QMF domain, a sinusoidal spectrum has a positive and a negative frequency. Specifically, the sinusoidal spectrum has one of those frequencies in the passband of a QMF sub-band and the other in the stop band of an adjacent sub-band. Considering that the QMF transform is an odd-stacked transform, such a pair of signal components is illustrated in figure 17.
[00172] Figure 17 is a diagram showing the relationship between the passband component and the stop band component for a sine wave in the complex QMF domain.
[00173] Here, the gray area denotes the stop band of a sub-band. For an arbitrary sinusoidal signal (continuous line) in the passband of a sub-band, its aliased part (dashed line) is located in the stop band of the adjacent sub-band (the two paired frequency components are connected by a line with double arrows).
[00174] A sinusoidal signal with frequency f0 is as shown in (Equation 17) below.
(Equation 17)
[00175] The passband component of the sinusoidal signal with the frequency f0 described above resides in the k-th sub-band if (Equation 18) below is satisfied.
(Equation 18)
[00176] In addition, its stop band component resides in the k̃-th sub-band if (Equation 19) below is satisfied.
(Equation 19)
[00177] If the sub-band is decomposed into 2Q sub-sub-bands, the above relationship can be expressed with a higher frequency resolution, as shown in (Equation 20) below.
(Equation 20)
[00178] Therefore, in the present embodiment, in order to map the sub-sub-bands in the stop band to an HF sub-band, it is necessary to associate them with the mapping results for the sub-sub-bands in the passband. The motivation for this operation is to ensure that the frequency pairs of LF components remain paired when they are shifted up to HF components.
[00179] For this purpose, first, it is straightforward to map the sub-sub-bands in the passband to the HF sub-band. Considering the center frequencies of the frequency-scaled sub-sub-bands and the frequency resolution of the QMF transform, the mapping function can be described by m(k, q), as shown in (Equation 21) below.
(Equation 21)
[00180] Here, q = -Q, -Q+1, ..., -1 if kLF is odd, or q = 0, 1, ..., Q-1 if kLF is even. The notation shown in (Equation 22) below denotes a rounding operation to obtain the integer nearest to x in the direction of minus infinity.
(Equation 22)
[00181] In addition, due to the upward scaling (t/2 > 1), it is possible that an HF sub-band has plural sub-sub-band mapping sources. That is, it is possible that m(k, q1) = m(k, q2) or m(k1, q1) = m(k2, q2). Therefore, an HF sub-band could be a combination of multiple sub-sub-bands of LF sub-bands, as shown in (Equation 23) below.
(Equation 23)
[00182] Here, q = -Q, -Q+1, ..., -1 if kLF is odd, or q = 0, 1, ..., Q-1 if kLF is even.
[00183] Second, following the aforementioned relationship between frequency pairs and sub-band indices, the mapping function for the sub-sub-bands in the stop band can be established as follows.
[00184] Considering an LF sub-band kLF, the mapping functions of the sub-sub-bands in its passband are already decided by the 1st stage as m(kLF, -Q), m(kLF, -Q+1), ..., m(kLF, -1) for odd kLF, or m(kLF, 0), m(kLF, 1), ..., m(kLF, Q-1) for even kLF. The sub-sub-bands in its stop band can then be mapped according to (Equation 24) below:
(Equation 24)
[00185] Here, 'condition a' refers to when kLF is even and (Equation 25) below is even, or when kLF is odd and (Equation 26) below is even.

[00186] In addition, (Equation 27) below denotes a rounding operation to obtain the integer nearest to x in the direction of minus infinity.

[00187] The resulting HF sub-band is the combination of all associated LF sub-bands, as shown in (Equation 28) below.
(Equation 28)
[00188] Here, q = -Q, -Q+1, ..., -1 if kLF is even, or q = 0, 1, ..., Q-1 if kLF is odd.
[00189] Finally, all mapping results in the passband and the stop band are combined to form the HF sub-band, as shown in (Equation 29) below.
(Equation 29)
[00190] Note that the above pitch shifting method in the QMF domain addresses both the high frequency quality degradation and the possible transient handling problem.
[00191] First, all patches now have the same stretching factor, the smallest one, which greatly reduces high frequency noise (coming from the incorrect signal components generated during time stretching). Second, all contributing sources of transient degradation are avoided. That is, there is no time domain resampling process, and the same stretching factor is used for all patches, which inherently eliminates the possibility of misalignment.
[00192] In addition, it should be noted that the present embodiment has a drawback in frequency resolution. Owing to the adoption of sub-band filtering, the frequency resolution is refined from π/M to π/(2Q·M), but it is still coarser than the fine frequency resolution of time domain resampling (π/L). However, considering that the human ear is less sensitive to a high frequency signal component, the pitch-shifted result produced by the present embodiment proved not to be perceptually different from that produced by the resampling method.
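The three frequency resolutions can be compared with a short calculation. The values L = 1024 and Q = 6 appear elsewhere in this description; M = 64 is an assumed number of QMF bands chosen only for illustration, and `resolutions` is a hypothetical helper.

```python
import math

def resolutions(M, Q, L):
    """Return (QMF sub-band, refined sub-sub-band, time-domain) resolutions."""
    qmf = math.pi / M                  # pi / M
    refined = math.pi / (2 * Q * M)    # pi / (2Q * M)
    time_domain = math.pi / L          # pi / L
    return qmf, refined, time_domain

# M = 64 is an assumption; Q = 6 and L = 1024 are taken from the text.
qmf_res, refined_res, time_res = resolutions(M=64, Q=6, L=1024)
```

With these values 2Q·M = 768 is smaller than L = 1024, so the refined resolution is finer than the plain QMF resolution yet still coarser than that of time domain resampling, in line with the statement above.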
[00193] Apart from the above, in comparison with the HBE scheme in the first embodiment, the HBE scheme in the present embodiment also offers a further reduction in the amount of computation, because only the low-order patch requires a time stretching operation.
[00194] Again, such a reduction in the amount of computation can be analyzed roughly by considering only the amount of computation contributed by the transforms.
[00195] Following the assumptions in the computation analysis mentioned above, the amount of transform computation involved in the HF spectrum generator in the present embodiment is approximately as shown below.

[00196] Therefore, Table 1 can be updated as follows.
[Table 2]
Table 2. Comparison of the amount of computation between the HBE scheme in the present embodiment and the HBE scheme in the first embodiment
[00197] The present invention is a new HBE technology for low bit rate audio coding. Using this technology, a broadband signal can be reconstructed from a low frequency bandwidth signal by generating its high frequency (HF) part through time stretching and frequency extension of the low frequency (LF) part in the QMF domain. In comparison with the prior art HBE technology, the present invention provides comparable sound quality with a much lower amount of computation. Such a technology can be used in applications such as mobile telephony and teleconferencing, in which an audio encoder-decoder operates at a low bit rate with a low amount of computation.
[00198] It should be noted that each of the function blocks in the block diagrams (Figures 6, 7, 13, 14 and so on) is typically realized as an LSI, which is an integrated circuit. The function blocks may be realized as separate individual chips, or as a single chip that includes some or all of them.
[00199] Although the term LSI is used here, the designations IC, system LSI, super LSI, and ultra LSI are also used, depending on the degree of integration.
[00200] In addition, the means of circuit integration is not limited to an LSI; implementation with a dedicated circuit or a general purpose processor is also possible. It is also acceptable to use a field programmable gate array (FPGA), which can be programmed after the LSI has been manufactured, or a reconfigurable processor in which the connections and settings of circuit cells within the LSI are reconfigurable.
[00201] Furthermore, if an integrated circuit technology that replaces LSI emerges through progress in semiconductor technology or another derived technology, that technology can naturally be used to integrate the function blocks.
[00202] Furthermore, among the respective function blocks, the unit that stores the data to be encoded or decoded may be made a separate structure, without being included in the single chip.
[Industrial Applicability]
[00203] The present invention relates to a new harmonic bandwidth extension (HBE) technology for low bit rate audio coding. With this technology, a broadband signal can be reconstructed from a low frequency bandwidth signal by generating its high frequency (HF) part through time stretching and frequency extension of the low frequency (LF) part in the QMF domain. In comparison with the prior art HBE technology, the present invention provides comparable sound quality with a much smaller amount of computation. Such a technology can be used in applications such as mobile telephony and teleconferencing, in which an audio encoder-decoder operates at a low bit rate with a low amount of computation.
[Reference Symbol List]
501-503, 602, 604, 605 bandpass unit
504-506 sampling unit
507-509, 601, 1404, 1505 QMF transform unit
510-512, 603 phase vocoder
513-515, 608-610, 1407, 1505, 1509 delay alignment unit
516, 611, 1410, 1511, 1512 addition unit
606, 607 frequency extension unit
1401, 1501 demultiplexing unit
1402, 1502 decoding unit
1403 time resampling unit
1405, 1504 time stretching unit
1406, 1508 T-F transform unit
1409, 1510 inverse T-F transform unit
1506 pitch shifting unit

Claims (8):
[0001]
1. Bandwidth extension method for producing a full bandwidth signal from a low frequency bandwidth signal, said method characterized by comprising: a first transform step (S21) of transforming the low frequency bandwidth signal into a quadrature mirror filter bank (QMF) domain to generate a first low frequency QMF spectrum; a low order harmonic patch generation step (S22) of generating a low order harmonic patch by time stretching the low frequency bandwidth signal in a QMF domain; a high frequency generation step (S23) of (i) generating pitch-shifted signals by applying different shift coefficients to the low order harmonic patch, and (ii) generating a high frequency QMF spectrum from the signals; a spectrum modification step (S24) of modifying the high frequency QMF spectrum to satisfy high frequency energy and tonality conditions; and a full bandwidth generation step (S25) of generating the full bandwidth signal by combining the modified high frequency QMF spectrum with the first low frequency QMF spectrum.
[0002]
2. Bandwidth extension method, according to claim 1, characterized by the fact that said low order harmonic patch generation step (S22) includes: a second transform step of transforming the low frequency bandwidth signal into a second low frequency QMF spectrum, where the second low frequency QMF spectrum has a finer frequency resolution than the first low frequency QMF spectrum.
[0003]
3. Bandwidth extension method, according to claim 1 or 2, characterized by the fact that said high frequency generation step (S23) includes: a bandpass step of bandpassing the low order harmonic patch to generate bandpassed patches; a high order generation step of mapping each of the bandpassed patches to high frequency to generate high order harmonic patches; and a summing step of summing the high order harmonic patches with the low order harmonic patch.
[0004]
4. Bandwidth extension method, according to claim 3, characterized by the fact that said high order generation step includes: a splitting step of splitting each QMF sub-band in each of the bandpassed patches into multiple sub-sub-bands; a mapping step of mapping the sub-sub-bands to high frequency QMF sub-bands; and a combining step of combining the results of the sub-sub-band mapping.
[0005]
5. Bandwidth extension method, according to claim 4, characterized by the fact that said mapping step includes: a splitting step of splitting the sub-sub-bands of each of the QMF sub-bands into a stopband part and a passband part; a frequency computation step of computing the transposed center frequencies of the sub-sub-bands in the passband part with a factor dependent on the patch order; a first mapping step of mapping the sub-sub-bands in the passband part to high frequency QMF sub-bands according to the center frequencies; and a second mapping step of mapping the sub-sub-bands in the stopband part to high frequency QMF sub-bands according to the sub-sub-bands of the passband part.
[0006]
6. Bandwidth extension apparatus that produces a full bandwidth signal from a low frequency bandwidth signal, said bandwidth extension apparatus characterized by comprising: a first transform unit (1503) configured to transform the low frequency bandwidth signal into a quadrature mirror filter bank (QMF) domain to generate a first low frequency QMF spectrum; a low order harmonic patch generation unit (1504) configured to generate a low order harmonic patch by time stretching the low frequency bandwidth signal in a QMF domain; a high frequency generation unit (1506) configured to (i) generate pitch-shifted signals by applying different shift coefficients to the low order harmonic patch, and (ii) generate a high frequency QMF spectrum from the signals; a spectrum modification unit (1507) configured to modify the high frequency QMF spectrum to satisfy high frequency energy and tonality conditions; and a full bandwidth generation unit (1512) configured to generate the full bandwidth signal by combining the modified high frequency QMF spectrum with the first low frequency QMF spectrum.
[0007]
7. Integrated circuit, characterized by producing a full bandwidth signal from a low frequency bandwidth signal, said circuit comprising: a first transform unit (1503) configured to transform the low frequency bandwidth signal into a quadrature mirror filter bank (QMF) domain to generate a first low frequency QMF spectrum; a low order harmonic patch generation unit (1504) configured to generate a low order harmonic patch by time stretching the low frequency bandwidth signal in a QMF domain; a high frequency generation unit (1506) configured to (i) generate pitch-shifted signals by applying different shift coefficients to the low order harmonic patch, and (ii) generate a high frequency QMF spectrum from the signals; a spectrum modification unit (1507) configured to modify the high frequency QMF spectrum to satisfy high frequency energy and tonality conditions; and a full bandwidth generation unit (1512) configured to generate the full bandwidth signal by combining the modified high frequency QMF spectrum with the first low frequency QMF spectrum.
[0008]
8. Audio decoding apparatus, characterized by comprising: a separation unit (1501) configured to separate an encoded low frequency bandwidth signal from encoded information; a decoding unit (1502) configured to decode the encoded low frequency bandwidth signal; a transform unit (1503) configured to transform the low frequency bandwidth signal, generated through decoding by said decoding unit (1502), into a quadrature mirror filter bank (QMF) domain to generate a low frequency QMF spectrum; a low order harmonic patch generation unit (1504) configured to generate a low order harmonic patch by time stretching the low frequency bandwidth signal in a QMF domain; a high frequency generation unit (1506) configured to (i) generate pitch-shifted signals by applying different shift coefficients to the low order harmonic patch, and (ii) generate a high frequency QMF spectrum from the signals; a spectrum modification unit (1507) configured to modify the high frequency QMF spectrum to satisfy high frequency energy and tonality conditions; a full bandwidth generation unit (1512) configured to generate the full bandwidth signal by combining the modified high frequency QMF spectrum with the low frequency QMF spectrum; and an inverse transform unit (1510) configured to transform the full bandwidth signal from a quadrature mirror filter bank (QMF) domain signal into a time domain signal.


Patent family:

Publication number | Publication date

AU2011263191B2|2016-06-16|

EP2581905A4|2014-11-05|

SG178320A1|2012-03-29|

AR082764A1|2013-01-09|

MX2012001696A|2012-02-22|

US20120136670A1|2012-05-31|

WO2011155170A1|2011-12-15|

KR101773631B1|2017-08-31|

JP5243620B2|2013-07-24|

US10566001B2|2020-02-18|

US9799342B2|2017-10-24|

EP3001419A1|2016-03-30|

US9093080B2|2015-07-28|

BR112012002839A2|2017-02-14|

CA2770287C|2017-12-12|

ZA201200919B|2013-07-31|

TWI545557B|2016-08-11|

TW201207840A|2012-02-16|

KR20130042460A|2013-04-26|

RU2012104234A|2014-07-20|

ES2565959T3|2016-04-07|

PL2581905T3|2016-06-30|

CN102473417A|2012-05-23|

EP3001419B1|2020-01-22|

EP2581905A1|2013-04-17|

RU2582061C2|2016-04-20|

US20200135217A1|2020-04-30|

JPWO2011155170A1|2013-08-01|

EP2581905B1|2016-01-06|

AU2011263191A1|2012-03-01|

US20150248894A1|2015-09-03|

MY176904A|2020-08-26|

CA2770287A1|2011-12-15|

US20170358307A1|2017-12-14|

JP2013084018A|2013-05-09|

JP5750464B2|2015-07-22|

CN102473417B|2015-04-08|

HUE028738T2|2017-01-30|


Legal status:
2017-10-10| B25A| Requested transfer of rights approved|Owner name: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AME |

2018-03-27| B15K| Others concerning applications: alteration of classification|Ipc: G10L 19/02 (2013.01), G10L 21/038 (2013.01), G10L |

2018-12-26| B06F| Objections, documents and/or translations needed after an examination request according [chapter 6.6 patent gazette]|

2020-03-17| B06A| Patent application procedure suspended [chapter 6.1 patent gazette]|

2020-08-25| B09A| Decision: intention to grant [chapter 9.1 patent gazette]|

2020-10-13| B16A| Patent or certificate of addition of invention granted [chapter 16.1 patent gazette]|Free format text: TERM OF VALIDITY: 20 (TWENTY) YEARS COUNTED FROM 06/06/2011, SUBJECT TO THE LEGAL CONDITIONS. |

Priority:

Application number | Filing date | Patent title

JP2010-132205|2010-06-09|

JP2010132205|2010-06-09|

PCT/JP2011/003168|WO2011155170A1|2010-06-09|2011-06-06|Band enhancement method, band enhancement apparatus, program, integrated circuit and audio decoder apparatus|
